Most Probable Explanations for Probabilistic Database Queries
Abstract
Probabilistic databases (PDBs) have been widely studied in the literature, as they form the foundations of large-scale probabilistic knowledge bases like NELL and Google’s Knowledge Vault. In particular, probabilistic query evaluation has been investigated intensively as a central inference mechanism. However, despite its power, query evaluation alone cannot extract all the relevant information expressed in PDBs. Inspired by the maximal posterior probability computations in Probabilistic Graphical Models (PGMs) [3], we investigate the problem of finding most probable explanations for database queries in order to exploit the full potential of such large databases. The most probable database [2] is the (classical) database with the largest probability that satisfies a given query. Intuitively, the query defines constraints on the data, and the goal is to find the most probable database that satisfies these constraints. We also introduce a more intricate notion, called the most probable hypothesis, which is only a partial database satisfying the query. The most probable hypothesis contains only facts that contribute to the satisfaction of the query, which allows us to pinpoint the most probable explanations for the query more precisely. We study the complexity of the corresponding decision problems for a variety of database query languages. In particular, we also consider ontology-mediated queries (OMQs), which enrich unions of conjunctive queries (UCQs) with the power of Datalog± ontologies and thus allow us to query PDBs in a more advanced manner [1]. We show that the complexity of these problems changes significantly with the ontology languages and the complexity-theoretic assumptions. Our results provide tight complexity bounds for a multitude of Datalog± languages (which cover some Horn Description Logics).
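To make the notion of a most probable database concrete, the following is a minimal brute-force sketch, not the method studied in the paper. It assumes a tuple-independent PDB (an assumption not stated in the abstract), represents the query as an arbitrary Python predicate over a set of facts, and uses hypothetical toy facts and names (`pdb`, `query`, `most_probable_database`) chosen purely for illustration.

```python
from itertools import combinations


def world_probability(world, pdb):
    """Probability of one classical database (world), assuming tuple independence."""
    prob = 1.0
    for fact, p in pdb.items():
        # Included facts contribute p, excluded facts contribute 1 - p.
        prob *= p if fact in world else 1.0 - p
    return prob


def most_probable_database(pdb, query):
    """Enumerate all 2^n worlds and keep the most probable one satisfying the query."""
    facts = list(pdb)
    best_world, best_prob = None, 0.0
    for size in range(len(facts) + 1):
        for subset in combinations(facts, size):
            world = frozenset(subset)
            if not query(world):
                continue
            prob = world_probability(world, pdb)
            if prob > best_prob:
                best_world, best_prob = world, prob
    return best_world, best_prob


# Hypothetical toy PDB: each fact maps to its marginal probability.
pdb = {
    ("Author", "alice", "paper1"): 0.9,
    ("Author", "bob", "paper1"): 0.4,
    ("Cites", "paper1", "paper2"): 0.7,
}


def query(world):
    """Hypothetical constraint: bob is the only author of paper1."""
    authors = {f[1] for f in world if f[0] == "Author" and f[2] == "paper1"}
    return authors == {"bob"}


world, prob = most_probable_database(pdb, query)
print(sorted(world), prob)
```

The query here acts as a constraint on the data: although the fact that alice authored paper1 is individually likely, the selected world must exclude it. The enumeration is exponential in the number of facts, so this sketch is only suitable for tiny examples.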
Similar resources
Most Probable Explanations for Probabilistic Database Queries (Extended Abstract)
Probabilistic databases (PDBs) have been widely studied in the literature, as they form the foundations of large-scale probabilistic knowledge bases like NELL and Google’s Knowledge Vault. In particular, probabilistic query evaluation has been investigated intensively as a central inference mechanism. However, despite its power, query evaluation alone cannot extract all the relevant information...
Most Probable Explanations for Probabilistic Database Queries
Forming the foundations of large-scale knowledge bases, probabilistic databases have been widely studied in the literature. In particular, probabilistic query evaluation has been investigated intensively as a central inference mechanism. However, despite its power, query evaluation alone cannot extract all the relevant information encompassed in large-scale knowledge bases. To exploit this pote...
The Most Probable Database Problem
This paper proposes a novel inference task for probabilistic databases: the most probable database (MPD) problem. The MPD is the most probable deterministic database where a given query or constraint is true. We highlight two distinctive applications, in database repair of key and dependency constraints, and in finding most probable explanations in statistical relational learning. The MPD probl...
A Trust Based Probabilistic Method for Efficient Correctness Verification in Database Outsourcing
Correctness verification of query results is a significant challenge in database outsourcing. Most of the proposed approaches impose high overhead, which makes them impractical in real scenarios. Probabilistic approaches are proposed in order to reduce the computation overhead pertaining to the verification process. In this paper, we use the notion of trust as the basis of our probabilistic app...
Probabilistic Abductive Logic Programming in Constraint Handling Rules
A class of Probabilistic Abductive Logic Programs (PALPs) is introduced and an implementation is developed in CHR for solving abductive problems, providing minimal explanations with their probabilities. Both all-explanations and most-probable-explanations versions are given. Compared with other probabilistic versions of abductive logic programming, the approach is characterized by higher genera...